Learning-Graph-Based Quantum Algorithm for $k$-distinctness

Aleksandrs Belovs* 



Abstract 

We present a quantum algorithm solving the $k$-distinctness problem in $O\bigl(n^{1-2^{k-2}/(2^k-1)}\bigr)$ queries
with a bounded error. This improves the previous $O(n^{k/(k+1)})$-query algorithm by Ambainis. The
construction uses a modified learning graph approach. Compared to the recent paper by Belovs and
Lee [7], the algorithm doesn't require any prior information on the input, and the complexity analysis
is much simpler.

Additionally, we introduce an $O(\sqrt{n}\,\alpha^{1/6})$ algorithm for the graph collision problem, where $\alpha$ is
the independence number of the graph.

1 Introduction 

The element distinctness problem consists of computing the function $f\colon [m]^n \to \{0,1\}$ that evaluates to 1
iff there is a pair of equal elements in the input, i.e., $f(x_1, \dots, x_n) = 1$ iff $\exists i \ne j : x_i = x_j$. (Here we
use the notation $[n] = \{1, 2, \dots, n\}$.) The quantum query complexity of the element distinctness problem is
well understood. It is known to be $\Theta(n^{2/3})$, with the algorithm given by Ambainis and the lower
bound shown by Aaronson and Shi [T] and Kutin [TS] for the case of large alphabet size $\Omega(n^2)$, and by
Ambainis [3] in the general case.

Ambainis' algorithm for the element distinctness problem was the first application of the quantum
random walk framework to a "natural" problem (i.e., one seemingly having little relation to random
walks), and it significantly changed the way quantum algorithms have been developed since then.
The core of the algorithm is a quantum walk on the Johnson graph. This primitive has been reused in
many other algorithms: triangle detection in a graph given by its adjacency matrix [23], matrix product
verification [13], restricted range associativity [15], and others. Given that the behavior of quantum walks
is well understood for arbitrary graphs [UJ 122], it is even surprising that the applications have been
mostly limited to the Johnson graph.

The $k$-distinctness problem is a direct generalization of the element distinctness problem. Given the
same input, the function evaluates to 1 iff there is a set of $k$ input elements that are all equal, i.e., a set
of indices $a_1, \dots, a_k \in [n]$ with $a_i \ne a_j$ and $x_{a_i} = x_{a_j}$ for all $i \ne j$.

The situation with the quantum query complexity of the $k$-distinctness problem is not so clear. (In
this paper we assume $k = O(1)$, and consider the complexity of $k$-distinctness as $n \to \infty$.) As element
distinctness reduces to $k$-distinctness by repeating each element $k - 1$ times, the lower bound of $\Omega(n^{2/3})$
carries over to the $k$-distinctness problem (this argument is attributed to Aaronson in Ref. [3]). This
simple lower bound is the best known so far.

In the same paper [1] with the element distinctness algorithm, Ambainis applied quantum walk on
the Johnson graph in order to solve the $k$-distinctness problem. This resulted in a quantum algorithm
with query complexity $O(n^{k/(k+1)})$. This was the best known algorithm for this problem prior to this
paper.

The aforementioned algorithms work by searching for a small subset of input variables such that the 
value of the function is completely determined by the values within the subset. For instance, the values 
of two input variables are sufficient to claim the value of the element distinctness function is 1, provided
their values are equal. This is formalized by the notion of certificate complexity as follows. 

An assignment for a function $f\colon \mathcal{D} \to \{0,1\}$ with $\mathcal{D} \subseteq [m]^n$ is a function $\alpha\colon S \to [m]$ with $S \subseteq [n]$.
The size of $\alpha$ is $|S|$. An input $x = (x_i) \in [m]^n$ satisfies assignment $\alpha$ if $\alpha(i) = x_i$ for all $i \in S$. An
assignment $\alpha$ is called a $b$-certificate for $f$, with $b \in \{0,1\}$, if $f(x) = b$ for any $x \in \mathcal{D}$ satisfying $\alpha$. The
certificate complexity $C_x(f)$ of $f$ on $x$ is defined as the minimal size of a certificate for $f$ that $x$ satisfies.
The $b$-certificate complexity $C^{(b)}(f)$ is defined as $\max_{x \in f^{-1}(b)} C_x(f)$. Thus, for instance, the 1-certificate
complexity of element distinctness is 2, and the 1-certificate complexity of triangle detection is 3.
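The definitions above can be checked by brute force on a tiny instance. The following is a minimal sketch, assuming a toy domain $[3]^3$ with 0-based values and indices; the function names are hypothetical and chosen only for this illustration.

```python
from itertools import combinations, product

def certificate_complexity(f, domain, x):
    """Smallest |S| such that fixing x on S forces the value f(x) on the domain."""
    n = len(x)
    for size in range(n + 1):
        for S in combinations(range(n), size):
            # Inputs of the domain that satisfy the assignment x restricted to S.
            agreeing = (z for z in domain if all(z[i] == x[i] for i in S))
            if all(f(z) == f(x) for z in agreeing):
                return size
    return n  # not reached: S = all indices is always a certificate

def b_certificate_complexity(f, domain, b):
    """Maximum of the certificate complexity over inputs with f(x) = b."""
    return max(certificate_complexity(f, domain, x) for x in domain if f(x) == b)

def element_distinctness(z):
    """1 iff some two entries of z are equal."""
    return int(len(set(z)) < len(z))

# Toy domain: all strings of length 3 over an alphabet of size 3 (an assumption).
domain = list(product(range(3), repeat=3))
one_cert = b_certificate_complexity(element_distinctness, domain, 1)
zero_cert = b_certificate_complexity(element_distinctness, domain, 0)
```

On this toy domain a pair of equal elements certifies a positive input, while a negative input can only be certified by reading all three positions.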

Soon after Ambainis' paper, it was realized [14] that the algorithm developed for $k$-distinctness
can be used to evaluate, in the same number of queries, any function with 1-certificate complexity equal
to $k$. Now we know that for some functions this algorithm is tight, due to the lower bound for the $k$-sum
problem [3]. The goal of the $k$-sum problem is to detect, given $n$ elements of an Abelian group as input,
whether there are $k$ of them that sum up to a prescribed element of the group. The $k$-sum problem is
notable in the sense that, given any $(k-1)$-tuple of input elements, one has absolutely no information
on whether they form a part of an (inclusion-wise minimal) 1-certificate, or not.

The aforementioned applications of the quantum walk on the Johnson graph (triangle finding, etc.)
went beyond the $O(n^{k/(k+1)})$ upper bound by utilizing additional relations between the input variables: the
adjacency relation of the edges for the triangle problem, row-column relations for the matrix products, 
and so on. For instance, two edges in a graph can't be a part of a 1-certificate for the triangle problem, 
if they are not adjacent. 

The $k$-distinctness problem is different in the sense that it doesn't possess any structure of the
variables. But it does possess a relation between the values of the variables: two elements can't be a 
part of a 1-certificate if their values are different. However, it seems that quantum walk on the Johnson 
graph fails to utilize this structure efficiently. 

In this paper, we use the learning graph approach to construct a quantum algorithm that solves the
$k$-distinctness problem in $O\bigl(n^{1-2^{k-2}/(2^k-1)}\bigr)$ queries. Note that $O\bigl(n^{1-2^{k-2}/(2^k-1)}\bigr) = o(n^{3/4})$. Thus,
our algorithm solves $k$-distinctness, for arbitrary $k$, in asymptotically fewer queries than the best previously
known algorithm solves 3-distinctness.
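The comparison of exponents can be verified with exact arithmetic. The sketch below only evaluates the two exponents stated in the text; the function names are hypothetical.

```python
from fractions import Fraction

def new_exponent(k):
    """Exponent 1 - 2^(k-2) / (2^k - 1) of the learning-graph algorithm."""
    return 1 - Fraction(2 ** (k - 2), 2 ** k - 1)

def ambainis_exponent(k):
    """Exponent k / (k + 1) of the earlier quantum-walk algorithm."""
    return Fraction(k, k + 1)

# Exact values for a few small k, as (k, new exponent, earlier exponent).
table = [(k, new_exponent(k), ambainis_exponent(k)) for k in range(2, 8)]
```

For $k = 2$ both exponents equal $2/3$; for every $k \ge 3$ the new exponent stays strictly below $3/4$, which is the earlier exponent for 3-distinctness.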

The learning graph is a novel way of constructing quantum query algorithms. In a sense, it may
be thought of as a way of designing a more flexible quantum walk than just on the Johnson graph. And
compared to the quantum walk design paradigms from Ref. [251122), it is easier to deal with. In particular,
it doesn't require any spectral analysis of the underlying graph.

To date, the applications of learning graphs are as follows. Belovs [6] introduced the framework and
used it to improve the query complexity of triangle detection. Zhu [26] and Lee, Magniez and Santha [20]
extended this algorithm to the containment of arbitrary subgraphs. Belovs and Lee [7] developed an
algorithm for the $k$-distinctness problem that beats the $O(n^{k/(k+1)})$-query algorithm given some prior
information about the input. Belovs and Reichardt [5] use a construction resembling a learning graph to
obtain an optimal algorithm for finding paths and claws of arbitrary length in the input graph. Also,
they deal with time-efficient implementation of learning graphs.

The paper is organized as follows. In Section 2, we define the (dual of the) adversary bound. It is
the main technical tool underlying our algorithm. Also, we describe learning graphs and the previous
algorithm for the $k$-distinctness problem. In Section 3, we describe the intuition behind our algorithm,
and describe the changes we have made to the model of the learning graph. In Section 4, we give
an algorithm for the graph collision problem as a preparation for the $k$-distinctness algorithm that we
describe in Sections 5 and 6. Strictly speaking, Sections 2.2 to 4 are not necessary for understanding
the $k$-distinctness algorithm: the proofs in Sections 5 and 6 rely on Theorem 2 only. However, these
sections are necessary for understanding the intuition behind the algorithm.

2 Preliminaries 

In this paper, we are mainly concerned with query complexity of quantum algorithms, i.e., we measure 
the complexity by the number of queries to the input the algorithm makes in the worst case. For the 
definition of query complexity and its basic properties, a good reference is [12].

In Section 2.1, we describe a tight characterization of the query complexity by a relatively simple
semi-definite program (SDP): the adversary bound, Eq. (1). This is the main technical tool underlying
our algorithm.

Although Eq. (1) is an SDP, and thus can be solved in polynomial time in the size of the program,
the latter is exponential in the number of variables, and becomes very hard to solve exactly as its size
grows. The learning graph [6] is a tool for designing feasible solutions to Eq. (1) whose complexity is
easier to analyze. We define it in Sections 2.2 and 2.3. In the first one, we describe the model following
Ref. [51 [7]. In the second one, we describe a common way of constructing learning graphs for specific
problems, and give an example of a learning graph for the $k$-distinctness problem corresponding to
Ambainis' algorithm.



2.1 Dual adversary bound 

The adversary bound, originally introduced by Ambainis [2], is one of the most important lower bound 
techniques for quantum query complexity. A strengthening of the adversary bound, known as the general 
adversary bound [16], has recently been shown to characterize quantum query complexity, up to constant
factors [21I2T].

The (general) adversary bound is a semi-definite program, and admits two equivalent formulations: 
the primal, used to prove lower bounds; and the dual, used in algorithm construction. We use the latter. 

Definition 1. Let $f\colon \mathcal{D} \to \{0,1\}$ with $\mathcal{D} \subseteq [m]^n$ be a function. The adversary bound $\mathrm{Adv}^{\pm}(f)$ is defined
as the optimal value of the following optimization problem:

minimize $\max_{x \in \mathcal{D}} \sum_{j \in [n]} X_j[x, x]$ (1a)

subject to $\sum_{j : x_j \ne y_j} X_j[x, y] = 1$ whenever $f(x) \ne f(y)$; (1b)

$X_j \succeq 0$ for all $j \in [n]$; (1c)

where the optimization is over positive semi-definite matrices $X_j$ with rows and columns labeled by the
elements of $\mathcal{D}$, and $X[x, y]$ is used to denote the element of matrix $X$ on the intersection of the row and
column labeled by $x$ and $y$, respectively.

The general adversary bound characterizes quantum query complexity. Let $Q(f)$ denote the query
complexity of the best quantum algorithm evaluating $f$ with a bounded error.

Theorem 2 ( [H] [MJ [2T] ). Let $f$ be as above. Then, $Q(f) = \Theta(\mathrm{Adv}^{\pm}(f))$.



2.2 Learning graphs: Model-driven description 

In this section we briefly introduce the simplest model of learning graph following Ref. [5J U\.

Definition 3. A learning graph $\mathcal{G}$ on $n$ input variables is a directed acyclic connected graph with vertices
labeled by subsets of $[n]$, the input indices. It has arcs connecting vertices labeled by $S$ and $S \cup \{j\}$ only,
where $S \subseteq [n]$ and $j \in [n] \setminus S$. The root of $\mathcal{G}$ is the vertex labeled by $\emptyset$. Each arc $e$ is assigned a positive
real weight $w_e$.

Note that it is allowed to have several (or no) vertices labeled by the same subset $S \subseteq [n]$. If there
is a unique vertex of $\mathcal{G}$ labeled by $S$, we usually use $S$ to denote it. Otherwise, we denote the vertex by
$(S, a)$ where $a$ is some additional parameter used to distinguish vertices labeled by the same subset $S$.

A learning graph can be thought of as a way of modeling the development of one's knowledge about
the input during a query algorithm. Initially, nothing is known, and this is represented by the root labeled
by $\emptyset$. At a vertex labeled by $S \subseteq [n]$, the values of the variables in $S$ have been learned. Following an
arc $e$ connecting vertices labeled by $S$ to $S \cup \{j\}$ can be interpreted as querying the value of variable $x_j$.
We say the arc loads element $j$. When talking about a vertex labeled by $S$, we call $S$ the set of loaded
elements.

The graph $\mathcal{G}$ itself has a very loose connection to the function being calculated. The following notion
is the essence of the construction. 

Definition 4. Let $\mathcal{G}$ be a learning graph on $n$ input variables, and $f\colon \mathcal{D} \to \{0,1\}$ be a function with
domain $\mathcal{D} \subseteq [m]^n$. A flow on $\mathcal{G}$ is a real-valued function $p_e(x)$ where $e$ is an arc of $\mathcal{G}$ and $x \in f^{-1}(1)$.
For a fixed input $x$, the flow $p_e = p_e(x)$ has to satisfy the following properties:

• vertex $\emptyset$ is the only source of the flow, and it has value 1. In other words, the sum of $p_e$ over all $e$
leaving $\emptyset$ is 1;

• a vertex labeled by $S$ is a sink iff it contains a 1-certificate for $f$ on input $x$. Such vertices are
called accepting. Thus, if $S \ne \emptyset$ and $S$ is not accepting then, for a vertex labeled by $S$, the sum of
$p_e$ over all in-coming arcs equals the sum of $p_e$ over all out-going arcs.

We always assume a learning graph $\mathcal{G}$ is equipped with a function $f$ and a flow $p$ that satisfy the
constraints of Definition 4. Define the negative complexity of $\mathcal{G}$ and the positive complexity for input
$x \in f^{-1}(1)$ as

$C^0(\mathcal{G}) = \sum_{e \in E} w_e$ and $C^1(\mathcal{G}, x) = \sum_{e \in E} \frac{p_e(x)^2}{w_e}$ (2)

respectively, where $E$ is the set of arcs of $\mathcal{G}$. The positive complexity and the (total) complexity of $\mathcal{G}$ are
defined as

$C^1(\mathcal{G}) = \max_{x \in f^{-1}(1)} C^1(\mathcal{G}, x)$ and $C(\mathcal{G}) = \max\{C^0(\mathcal{G}), C^1(\mathcal{G})\}$, (3)

respectively.¹ The following theorem links learning graphs and quantum query algorithms:

Theorem 5 ([7]). Assume $\mathcal{G}$ is a learning graph for a function $f\colon \mathcal{D} \to \{0,1\}$ with $\mathcal{D} \subseteq [m]^n$. Then
there exists a bounded-error quantum query algorithm for the same function with complexity $O(C(\mathcal{G}))$.

Proof sketch. We reduce to Theorem 2. For each arc $e$ from $S$ to $S \cup \{j\}$, we define a block-diagonal
matrix $X_j^e = \sum_\alpha Y_\alpha$, where the sum is over all assignments $\alpha$ on $S$. Each $Y_\alpha$ is defined as $\psi\psi^*$ where,
for each $z \in \mathcal{D}$:

$\psi[z] = \begin{cases} p_e(z)/\sqrt{w_e}, & f(z) = 1 \text{, and } z \text{ satisfies } \alpha; \\ \sqrt{w_e}, & f(z) = 0 \text{, and } z \text{ satisfies } \alpha; \\ 0, & \text{otherwise.} \end{cases}$

Finally, we define $X_j$ in (1) as $\sum_e X_j^e$ where the sum is over all arcs $e$ loading $j$.

Condition (1c) is trivial, and the expression for the objective value (1a) is straightforward to check.
The feasibility (1b) is as follows. Fix any $x \in f^{-1}(1)$ and $y \in f^{-1}(0)$. By construction, $X_j^e[x, y] = p_e(x)$
if $x_S = y_S$, where $S$ is the origin of $e$; otherwise, it is zero. Thus, only arcs $e$ from $S$ to $S \cup \{j\}$ such that
$x_S = y_S$ and $x_j \ne y_j$ contribute to the sum in (1b). These arcs define a cut between the source and
all the sinks of the flow $p_e = p_e(x)$, hence, the total value of the flow on these arcs is 1, as required. □
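The construction in the proof sketch can be exercised on a toy instance. The following is a minimal sketch, assuming element distinctness on $[3]^3$ with 0-based indices, unit arc weights, and a flow that routes one unit along a single path to the first pair of equal elements; these choices and all names are illustrative assumptions, not the learning graph analyzed in this paper. Condition (1c) holds by construction (each $X_j$ is a sum of outer products) and is not checked numerically.

```python
from itertools import product
from math import sqrt

n, m = 3, 3
domain = list(product(range(m), repeat=n))

def f(z):
    """Element distinctness: 1 iff some two entries are equal."""
    return int(len(set(z)) < n)

# Arcs (S, j) lead from the vertex labeled S to the vertex labeled S ∪ {j}, |S| <= 1.
arcs = [(frozenset(), j) for j in range(n)]
arcs += [(frozenset({i}), j) for i in range(n) for j in range(n) if i != j]
weight = {e: 1.0 for e in arcs}  # unit weights: an arbitrary illustrative choice

def marked_pair(x):
    """First pair a < b with x[a] == x[b]; defined for positive inputs only."""
    for a in range(n):
        for b in range(a + 1, n):
            if x[a] == x[b]:
                return a, b

def flow(e, x):
    """Unit flow along the path: root -> {a} -> {a, b}."""
    a, b = marked_pair(x)
    return 1.0 if e in ((frozenset(), a), (frozenset({a}), b)) else 0.0

def psi(e, z):
    """Entry of the vector psi for arc e on an input z that satisfies the assignment."""
    return flow(e, z) / sqrt(weight[e]) if f(z) else sqrt(weight[e])

def X_entry(j, x, y):
    """X_j[x, y]: sum over arcs loading j whose origin S has x_S == y_S."""
    total = 0.0
    for e in arcs:
        S, loaded = e
        if loaded == j and all(x[i] == y[i] for i in S):
            total += psi(e, x) * psi(e, y)
    return total

def constraint_1b(x, y):
    """Left-hand side of (1b) for the pair (x, y)."""
    return sum(X_entry(j, x, y) for j in range(n) if x[j] != y[j])

def objective():
    """Objective value (1a) of the constructed solution."""
    return max(sum(X_entry(j, z, z) for j in range(n)) for z in domain)
```

With unit weights the objective equals the number of arcs, since the negative inputs dominate; rescaling the weights as in Footnote 1 would balance the two complexities.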



2.3 Learning graphs: Procedure-driven description 

In this section, we describe a way of designing learning graphs that was used in Ref. [B] and other papers.
The learning graph, introduced in Section 2.2, may be considered as a randomized procedure for loading
values of the variables with the goal of convincing someone that the value of the function is 1. For each
input $x \in f^{-1}(1)$, the designer of the learning graph builds its own procedure. The goal is to load a
1-certificate for $x$. Usually, for each positive input, one specific 1-certificate is chosen. The elements
inside the certificate are called marked. The procedure is not allowed to err, i.e., it always has to load
all the marked elements in the end. The value of the complexity of the learning graph arises from the
interplay between the procedures for different inputs.

We illustrate these concepts with an example of a learning graph corresponding to the $k$-distinctness
algorithm by Ambainis [3]. Fix a positive input $x$, i.e., one evaluating to 1. Let $M = \{a_1, a_2, \dots, a_k\}$ be
such that $x_{a_1} = x_{a_2} = \dots = x_{a_k}$. It is a 1-certificate for $x$. The elements inside $M$ are marked. One
possible way of loading the marked elements consists of $k + 1$ stages and is given in Table 1. The internal
randomness of the procedure is concealed in the choice of the $r$ elements on stage I. (Here $r = o(n)$ is
some parameter to be specified later.) Each choice has probability $q = \binom{n-k}{r}^{-1}$.



I.      Load $r$ elements different from $a_1, \dots, a_k$.
II.1    Load $a_1$.
II.2    Load $a_2$.
...
II.k    Load $a_k$.

Table 1: Learning graph for the $k$-distinctness problem corresponding to the algorithm from Ref. [4].

1 Ref. [6] defines $C(\mathcal{G})$ as $\sqrt{C^0(\mathcal{G}) C^1(\mathcal{G})}$. Both definitions are equivalent, because one may make both $C^0(\mathcal{G})$ and $C^1(\mathcal{G})$
equal to $\sqrt{C^0(\mathcal{G}) C^1(\mathcal{G})}$ by simultaneously scaling the weights of all the arcs by an appropriate coefficient.

Let us describe how a graph $\mathcal{G}$ and a flow $p$ are constructed from the description in Table 1. First,
we define the key vertices of $\mathcal{G}$. If $d$ is the number of stages, the key vertices are $V_0 \cup \dots \cup V_d$, where
$V_0 = \{\emptyset\}$ and $V_i$ consists of all possible sets of variables loaded after $i$ stages.

For a fixed input $x$ and fixed internal randomness, the sets $S_{i-1} \in V_{i-1}$ and $S_i \in V_i$ of variables
loaded before and after stage $i$, respectively, are uniquely defined. In this case, we connect $S_{i-1}$ and $S_i$
by a transition $e$.² For that, we choose an arbitrary order $t_1, \dots, t_\ell$ of the elements in $S_i \setminus S_{i-1}$, and connect
$S_{i-1}$ and $S_i$ by a path:

$S_{i-1},\ (S_{i-1} \cup \{t_1\}, e),\ (S_{i-1} \cup \{t_1, t_2\}, e),\ \dots,\ (S_i \setminus \{t_\ell\}, e),\ S_i$

in $\mathcal{G}$. Here, the additional labels $e$ in the internal vertices ensure that the paths corresponding to the
transitions do not intersect, except at the ends. We say transition $e$ and all arcs therein belong to stage
$i$.

In a case like the one in the previous paragraph, we say the transition $e$ is taken for this choice of $x$ and
the randomness. We say a transition is used for input $x$ if it is taken for some choice of the internal
randomness. The set of transitions of $\mathcal{G}$ is the union of all transitions used for all inputs in $f^{-1}(1)$. For
instance, stage II.2 of the learning graph from Table 1 consists of all transitions from $S$ to $S \cup \{j\}$ where
$|S| = r + 1$ and $j \notin S$. For an example, refer to Figure 1.




Figure 1: The learning graph for $k$-distinctness from Table 1 in the case $k = 2$, $n = 5$ and $r = 2$. Stages
I, II.1 and II.2 are shown.

The flow $p_e(x)$ is defined as the probability, over the internal randomness, that transition $e$ is taken
for input $x$. All arcs forming the transition are assigned the same flow. Thus, the transition $e$ is used by
$x$ iff $p_e(x) > 0$. In the learning graph from Table 1, $p_e(x)$ attains two values only: 0 and $q$.

So far, we have constructed the graph $\mathcal{G}$ and the flow $p$. It remains to define the weights $w_e$. This is
done using Theorem 6 below. But, for that, we need some additional notions.

The length $L_i$ of stage $i$ is the number of variables loaded on this stage, i.e., $|S_i \setminus S_{i-1}|$ for a transition
$e$ from $S_{i-1}$ to $S_i$ of stage $i$. In our applications in this paper this number is independent of the choice
of $e$. We say the flow is symmetric on stage $i$ if the non-zero value of $p_e(x)$ is the same for all $e$ on stage
$i$ and all $x$.³ The flow in the learning graph from Table 1 is symmetric.

If the flow is symmetric on stage $i$, we define the speciality $T_i$ of stage $i$ as the ratio of the total
number of transitions on stage $i$ to the number of ones used by $x$. In a symmetric flow, this quantity
doesn't depend on $x$.

Finally, we define the (total) complexity of stage $i$, $C_i(\mathcal{G})$, similarly to how $C(\mathcal{G})$ is defined in (2) and (3),
with the summation over $E_i$, the set of all arcs on stage $i$, instead of $E$. It is easy to see that $C(\mathcal{G})$ is at
most $\sum_i C_i(\mathcal{G})$.

Theorem 6 ([B]). If the flow is symmetric on stage $i$, the arcs on stage $i$ can be weighted so that the
complexity of the stage becomes $L_i\sqrt{T_i}$.

Proof sketch. Let $q$ be the non-zero value of the flow on stage $i$. Assign weight $q/\sqrt{T_i}$ to all arcs on stage
$i$. □
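The weight choice in the proof sketch can be verified in two lines. The derivation below is a sketch under notation introduced only for this illustration: $N_i$ denotes the total number of transitions on stage $i$ (a symbol not used in the text) and $L_i$ denotes the number of variables loaded on the stage. Each positive input then uses $N_i/T_i$ transitions, and conservation of the unit flow through the stage is assumed to give $q N_i / T_i = 1$.

```latex
\begin{align*}
C_i^0(\mathcal{G}) &= \sum_{e \in E_i} w_e
  = N_i L_i \cdot \frac{q}{\sqrt{T_i}}
  = L_i \sqrt{T_i} \cdot \frac{q N_i}{T_i}
  = L_i \sqrt{T_i}, \\
C_i^1(\mathcal{G}, x) &= \sum_{e \in E_i} \frac{p_e(x)^2}{w_e}
  = \frac{N_i}{T_i} L_i \cdot \frac{q^2 \sqrt{T_i}}{q}
  = L_i \sqrt{T_i} \cdot \frac{q N_i}{T_i}
  = L_i \sqrt{T_i}.
\end{align*}
```

Both the negative and the positive complexity of the stage therefore equal $L_i\sqrt{T_i}$ under this weighting.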

2 In Ref. |2, the graph formed by the key vertices and the transitions is called a reduced learning graph.
3 This is a less general definition than in Ref. [G], but it suffices for our purposes.

Now we are able to calculate the complexity of the learning graph in Table 1. The length of stage I
is $r$, and the length of stage II.$i$ is 1 for all $i$. It is also not hard to see that the corresponding specialities
are $O(1)$ and $O(n^i/r^{i-1})$. For example, a transition from $S$ to $S \cup \{j\}$ on stage II.$k$ is used by input $x$
iff $a_1, \dots, a_{k-1} \in S$ and $j = a_k$. For a random choice of $S$ and $j \notin S$, the probability of $j = a_k$ is $1/n$,
and the probability of $a_1, \dots, a_{k-1} \in S$, given $j = a_k$, is $\Omega(r^{k-1}/n^{k-1})$. Thus, the total probability is
$\Omega(r^{k-1}/n^k)$ and the speciality is the inverse of that.

Thus, the complexity of the algorithm, by Theorems 5 and 6, is $O\bigl(r + \sqrt{n^k/r^{k-1}}\bigr)$. It is optimized
when $r = n^{k/(k+1)}$, and the complexity is $O(n^{k/(k+1)})$.
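The balance point can be checked numerically. The sketch below simply evaluates the two terms of the bound with constants dropped; the function names and the sample values of $n$ and $k$ are assumptions made for illustration.

```python
from math import sqrt

def stage_costs(n, k, r):
    """The two terms of the bound: loading r elements, and the last stage."""
    return r, sqrt(n ** k / r ** (k - 1))

def balanced_r(n, k):
    """Value of r at which the two terms coincide, r = n^(k/(k+1))."""
    return n ** (k / (k + 1))

n, k = 10 ** 6, 3
r_star = balanced_r(n, k)
first, second = stage_costs(n, k, r_star)
```

At the balance point both terms equal $n^{k/(k+1)}$; taking $r$ smaller makes the last stage dominate, and taking $r$ larger makes the first stage dominate.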

3 Outline of the algorithm 

In this section we describe how the learning graph from Table 1 is transformed into a new learning graph
with a better complexity. Many times when learning graphs were applied to new problems, they were
modified accordingly [5J[7J[5]. This paper is not an exception; thus, we also describe the modifications
we make to the model of a learning graph.

The main point of the learning graph in Table 1 and similar ones is to reduce the speciality of the last
step, loading $a_k$. In the learning graph from Table 1, it is achieved by loading $r$ non-marked elements
before loading the certificate. This way, the speciality of the last step gets reduced from $O(n^k)$ to
$O(n^k/r^{k-1})$. We say that $a_1, \dots, a_{k-1}$ are hidden among the $r$ elements loaded on stage I. The larger
the set we hide the elements into, the better.

Unfortunately, we can't make $r$ as large as we like, because loading the non-marked elements also
counts towards the complexity. At the equilibrium point $r = n^{k/(k+1)}$, we attain the optimal complexity
of the learning graph.

In Ref. [7], a learning graph was constructed with better complexity. It uses a more general version of
the learning graph than in Section 2.2, with weights of the arcs dependent on the values of the elements
loaded so far. Its main idea is to hide $a_1, \dots, a_{k-1}$ as one entity, not $k - 1$ independent elements. By
gradually distilling vertices of the learning graph having a large number of $(k-1)$-tuples of equal elements,
the learning graph manages to reduce the speciality of the last step without increasing the number of
elements loaded, because $\{a_1, \dots, a_{k-1}\}$ gets hidden among a relatively large number of $(k-1)$-tuples
of equal elements.

But this learning graph has serious drawbacks. Due to dealing with the values of the variables in the 
distilling phase, the flow through the learning graph ceases to be symmetric and depends heavily on the 
input. This makes the analysis of the learning graph quite complicated. What is even worse, the learning 
graph requires strong prior knowledge on the structure of the input to attain reasonable complexity. 

In this paper we construct a learning graph that combines the best features of both learning graphs.
Its complexity is the same as in Ref. [7]. Also, its flow is symmetric and almost independent of the
input, like the one in Table 1. This has three advantages compared to the learning graph in Ref. [7]:
its complexity is easier to analyze, it doesn't require any prior information on the input, and it is more
suitable for a time-efficient implementation along the lines of Ref. [5]. This is achieved at the cost of a
more involved construction.

Let us outline the modifications the learning graph from Table 1 undergoes in order to reduce the
complexity. Again, we assume $x$ is a positive input, and $M = \{a_1, \dots, a_k\}$ is such that $x_{a_1} = \dots = x_{a_k}$.

1. We achieve a symmetric flow with smaller speciality of the last step by finding a way to load more
non-marked elements in the first stages of the learning graph. There is an indication that it is
possible in some cases: the values of $r$ Boolean variables can be learned in fewer than $r$ queries, if
there is a bias between the number of ones and zeros [10]. More precisely, if the number of ones is
$\ell$, the values can be loaded in $O(\sqrt{r\ell})$ queries.

2. We start with dividing the set $S$ of loaded elements into $k - 1$ subsets: $S = S_1 \sqcup \dots \sqcup S_{k-1}$, where $\sqcup$
denotes disjoint union. Set $S_i$ has size $r_i = o(n)$. We use $S_i$ to hide $a_i$ when loading $a_k$. This step
doesn't reduce the speciality, but this division will be necessary further.

3. Consider the situation before loading $a_k$. If an element $j \in S_2$ is such that $x_j \ne x_t$ for all $t \in S_1$,
this element cannot be a part of the certificate (i.e., it can't be $a_2$), and its precise value is irrelevant.
(This is the place where we utilize the relations between the values of the variables as mentioned in
the introduction.) In this case, we say $j$ doesn't have a match in $S_1$, and represent it by a special
symbol $*$. Otherwise, we uncover the element, i.e., load its precise value. Similarly, when loading
$S_i$ with $i > 2$, we uncover only those elements that have a match among the uncovered elements
of $S_{i-1}$.

4. Usually, the number of elements in $S_i$ having a match in $S_{i-1}$ is much smaller than the total number
of elements in $S_i$. Similarly to Point 1, we can reduce the complexity of loading elements in $S_i$
because of this bias. Thus, we have $r_i = \omega(r_1)$, while the complexity of loading $S_i$ remains $O(r_1)$.
Now we have more elements to hide $a_i$ among; hence, the speciality of loading $a_k$ gets reduced.

5. When loading $a_k$, we do want $a_i$ to be in $S_i$ for $i \in [k - 1]$, because that is where we hide them.
On the other hand, in order to keep the speciality of loading non-marked elements in $S_1, \dots, S_{k-1}$
equal to $O(1)$, we would like to add $a_1$ to $S_1$ only after all elements in $S_{k-1}$ have already been
loaded. Thus, we load $a_1, \dots, a_{k-1}$ between these two stages and put them in $S_1, \dots, S_{k-1}$. This
is summarized in Table 2.

6. Since the uncovering of elements in $S_i$, for $i > 1$, depends on the values contained in $S_j$ with $j < i$,
adding $a_j$ to $S_j$ afterwards is a bit of cheating. This does cause some problems that we describe in more
detail in Section 5.3. We describe a solution in Section 6.

I.1       Load a set $S_1$ of $r_1$ elements not from $M$.
I.2       Load a set $S_2$ of $r_2$ elements not from $M$, uncovering only those elements that have
          a match in $S_1$.
I.3       Load a set $S_3$ of $r_3$ elements not from $M$, uncovering only those elements that have
          a match among the uncovered elements of $S_2$.
...
I.(k-1)   Load a set $S_{k-1}$ of $r_{k-1}$ elements not from $M$, uncovering only those elements that
          have a match among the uncovered elements of $S_{k-2}$.
II.1      Load $a_1$ and add it to $S_1$.
...
II.(k-1)  Load $a_{k-1}$ and add it to $S_{k-1}$.
II.k      Load $a_k$.

Table 2: An illustrative (not correct) version of the learning graph for $k$-distinctness

In order to account for these changes, we use the following modifications to the learning graph model. 

A. In Section IH1 we are forced to drop the flow notion from Definition 4.⁴ We use Theorem 2 directly,
borrowing some concepts from the proof of Theorem 5, namely, the notions of a vertex and an arc
leaving it. Also, we keep the internal randomness intuition from Section 2.3. The loading procedure
still doesn't err in some sense formalized in (fl4^).

B. We change the way the vertices of the learning graph are represented. Firstly, we keep track of
which $S_i$ each loaded element belongs to, as described in Point 2. Also, we assume the condition
on uncovering of elements, and use the special symbol $*$ as a notation for a covered element, as
described in Point 3. Technically, this corresponds to a modification of the definition of an assignment
$\alpha$ in $Y_\alpha$ in the proof of Theorem 5.

C. Instead of having a rank-1 matrix $Y_\alpha$ as in the proof of Theorem 5, we define it as a rank-2 matrix.
The weight of the arc now depends on the value of the variable being loaded as well, although in
a rather restricted form. Thus, we are able to make use of the bias as described in Point 1 and to
account for the introduction of $*$ in Point 3.

4 The reader should not be confused by our earlier statement that the flow is symmetric, because when considering one
stage, the part of the "flow" is still symmetric. It is only not defined where it comes from and where it goes afterwards.
See also Footnote 9.

The remaining part of the paper is organized as follows. In Section 4, we give a learning graph for
the graph collision problem that uses some ideas from above (Points 1 and C). In Sections 5 and 6, we
describe the algorithm for $k$-distinctness. In order to simplify the exposition, we first give a version of
the learning graph from Table 2 that illustrates the main idea of the algorithm, but has a flaw. We
identify it in Section 5.3, and then describe a work-around in Section 6. The complexity analysis of the
second algorithm is analogous to the first one, so we do it for the first algorithm.

4 Warm-up: Graph collision 

In order to get ready for the $k$-distinctness algorithm, we start with a learning graph for the graph collision
problem with an additional promise. It is a learning graph version of the algorithm by Ambainis [5].

The graph collision problem is one of the ingredients of the triangle finding quantum algorithm by
Magniez et al. [23] and the learning-graph-based quantum algorithm by Belovs [5]. It is also used in the
algorithm for Boolean matrix multiplication by Jeffery et al. [17].

The problem is parametrized by a simple graph $G$ on $n$ vertices. The input is formed by $n$ Boolean
variables: one for each vertex of the graph. The function evaluates to 1 if there exists an edge of $G$ with
both endpoints marked by value 1, and to 0 otherwise.

The best known quantum algorithm solving this problem for a general graph $G$ uses $O(n^{2/3})$ queries.
For specific classes of graphs one can do better. For instance, if $G$ is the complete graph, graph collision
is equivalent to the 2-threshold problem that can be solved in $O(\sqrt{n})$ queries by two applications of the
Grover algorithm. The algorithm in this section may be interpreted as an interpolation between this
trivial special case and the general case.

Recall that the independence number $\alpha(G)$ of a simple graph $G$ is the maximal cardinality of a subset
of vertices of $G$ such that no two of them are connected by an edge.

Theorem 7. Graph collision on graph $G$ can be solved in $O(\sqrt{n}\,\alpha^{1/6})$ quantum queries with bounded
error, where $\alpha = \alpha(G)$ is the independence number of $G$.

Note that if $G$ is a complete graph, $\alpha(G) = 1$, and we get the previously mentioned $O(\sqrt{n})$-algorithm
for this trivial case. In the general case, $\alpha(G) = O(n)$, and the complexity of the algorithm is $O(n^{2/3})$,
which coincides with the complexity of the algorithm for a general graph.
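The quantities in Theorem 7 are easy to compute for small graphs. The following is a minimal classical sketch, assuming vertices are numbered from 0 and edges are given as pairs; the brute-force independence number is exponential in $n$ and is meant for illustration only, and `query_bound` drops the constant hidden in the $O$-notation. All names are hypothetical.

```python
from itertools import combinations

def graph_collision(edges, x):
    """1 iff some edge of the graph has both endpoints marked by 1."""
    return int(any(x[u] and x[v] for u, v in edges))

def independence_number(n, edges):
    """Largest vertex subset with no edge inside it (brute force)."""
    edge_set = {frozenset(e) for e in edges}
    for size in range(n, 0, -1):
        for subset in combinations(range(n), size):
            if all(frozenset(p) not in edge_set for p in combinations(subset, 2)):
                return size
    return 0

def query_bound(n, alpha):
    """sqrt(n) * alpha^(1/6), the bound of Theorem 7 without constants."""
    return n ** 0.5 * alpha ** (1 / 6)

complete_4 = list(combinations(range(4), 2))  # complete graph on 4 vertices
path_4 = [(0, 1), (1, 2), (2, 3)]             # path on 4 vertices
```

For the complete graph the bound reduces to $\sqrt{n}$, and for $\alpha = n$ it becomes $n^{2/3}$, matching the two extreme cases discussed above.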

Jeffery et al. [17] build a quantum algorithm solving graph collision on $G$ in $O(\sqrt{n} + \sqrt{m})$ queries if
$G$ misses $m$ edges to be a complete graph. This algorithm is incomparable to the one in Theorem 7: for
some graphs the algorithm from Theorem 7 performs better; for other graphs, vice versa.

Proof of Theorem 7. Let $f$ be the graph collision function specified by graph $G$. The first step of the
algorithm is quantum counting [11]. We distinguish the case when the number of ones in the input is
at most $\alpha$, and when it is at least $2\alpha$. In the intermediate case, the counting subroutine is allowed to
return any of the outcomes. The complexity of the subroutine is $O(\sqrt{n})$.

If we know, with high probability, that the number of ones is greater than $\alpha$, we may claim that
graph collision exists, because any set of more than $\alpha$ vertices contains an edge of $G$. Otherwise, we may
assume the number of ones is at most $2\alpha$. In this case, we execute the following learning graph $\mathcal{G}$.

The learning graph is essentially the learning graph from Table 1 for 2-distinctness. Let us denote,
for simplicity, $a = a_1$ and $b = a_2$. Then, instead of loading $a$ and $b$ such that $x_a = x_b$, the graph collision
learning graph loads $a$ and $b$ such that $x_a = x_b = 1$ and $ab$ is an edge of $G$. We reduce the complexity
of the learning graph by utilizing the bias between the number of zeros and ones induced by the small
independence number, as outlined in Points 1 and C of Section 3.

One could prove the correctness of the algorithm completely analogously to the correctness proof of 
the algorithm from Section 12.31 However, in the preparation for future discard of the notion of flow 
(Point 1X1 from Section [3]), we use language from Section [51 The reader is encouraged to compare both 
ways of the proof. 

Let $x$ be a positive input, and let $a$ and $b$ be such that $x_a = x_b = 1$ and $ab$ is an edge of $G$. The set
$M = \{a, b\}$ is a 1-certificate for $x$.

The key vertices of the learning graph are $V_1 \cup V_2$, where $V_1$ and $V_2$ consist of all subsets of $[n]$ of
sizes $r$ and $r + 1$, respectively, where $r = o(n)$ is some parameter to be specified later.$^5$

A vertex in $V_1$ completely specifies the internal randomness. For each $R \in V_1$, we fix an arbitrary
order of its elements: $R = \{t_1, \dots, t_r\}$. We say the choice of randomness $R \in V_1$ is consistent with $x$ if
$\{a, b\} \cap R = \emptyset$. For each $x \in f^{-1}(1)$, there are exactly $\binom{n-2}{r}$ choices of $R \in V_1$ consistent with $x$. We
take each of them with probability $q = \binom{n-2}{r}^{-1}$.

For a fixed input $x$ and fixed randomness $R = \{t_1, \dots, t_r\} \in V_1$ consistent with $x$, the elements are
loaded (we are going to define what this means later) in the following order:

$$t_1,\; t_2,\; \dots,\; t_r,\; t_{r+1} = a,\; t_{r+2} = b. \tag{4}$$



The non-key vertices of $\mathcal{G}$ are of the form $v = (\{t_1, \dots, t_\ell\}, R)$, where $0 \le \ell < r$, $R \in V_1$, and $t_i$
are from (4). Recall that, as stated in Section 2.2, the first element of the pair is the set of loaded
elements, and the second one is an additional mark used to distinguish vertices with the same set of
loaded elements.

An arc of the learning graph is a process of loading one variable. We denote it by $A^v_j$. Here, $j$ is the
variable the arc loads, and $v$ is a vertex of $\mathcal{G}$ it originates in. In our case, the arcs are as follows. The
arcs of stage I have $v = (\{t_1, \dots, t_\ell\}, R)$ and $j = t_{\ell+1}$ with $0 \le \ell < r$.$^6$ The arcs of stages II.1 and
II.2 have $v = S$, with $S \in V_1$ and $S \in V_2$, respectively, and $j \notin S$.

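For a fixed $x \in f^{-1}(1)$ and fixed internal randomness $R \in V_1$ consistent with $x$, the arcs taken are

$$A^{(\{t_1, \dots, t_\ell\}, R)}_{t_{\ell+1}} \ \text{for } 0 \le \ell < r, \qquad A^{R}_{a}, \qquad \text{and} \qquad A^{R \cup \{a\}}_{b}. \tag{5}$$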



Recall, we say $x$ satisfies an arc if the arc is taken for some $R \in V_1$ consistent with $x$. Note also, no arc
is taken for two different choices of the randomness.

Like in the proof of Theorem 3, for each arc $A^v_j$, we assign a matrix $X^v_j \succeq 0$. Then, $X_j$ in (1) are
given by $X_j = \sum_v X^v_j$.

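Fix $A^v_j$ and let $S$ be the set of loaded elements. Recall that an assignment on $S$ is a function
$\alpha\colon S \to \{0, 1\}$. An input $z \in \{0, 1\}^n$ satisfies assignment $\alpha$ iff $z_t = \alpha(t)$ for each $t \in S$. We say inputs
$x$ and $y$ agree on $S$, if they satisfy the same assignment $\alpha$. Let $X^v_j = \sum_\alpha Y_\alpha$ where the sum is over all
assignments $\alpha$ on $S$. The matrix $Y_\alpha$ is defined as $q(\psi\psi^* + \phi\phi^*)$, where, for each $z \in \{0, 1\}^n$,

$$\psi[z] = \begin{cases} 1/\sqrt{w_1}, & f(z) = 1,\ z_j = 1,\ \text{and } z \text{ satisfies } \alpha \text{ and the arc } A^v_j; \\ \sqrt{w_1}, & f(z) = 0,\ z_j = 0,\ \text{and } z \text{ satisfies } \alpha; \\ 0, & \text{otherwise;} \end{cases}$$

and

$$\phi[z] = \begin{cases} 1/\sqrt{w_0}, & f(z) = 1,\ z_j = 0,\ \text{and } z \text{ satisfies } \alpha \text{ and the arc } A^v_j; \\ \sqrt{w_0}, & f(z) = 0,\ z_j = 1,\ \text{and } z \text{ satisfies } \alpha; \\ 0, & \text{otherwise.} \end{cases}$$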

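Here $w_0$ and $w_1$ are parameters to be specified later (the weights of the arc). They depend only on the
stage the arc belongs to. In other words, $X^v_j$ consists of the blocks of the following form:

$$\begin{array}{c|cccc}
 & x_j = 1 & x_j = 0 & y_j = 1 & y_j = 0 \\ \hline
x_j = 1 & q/w_1 & 0 & 0 & q \\
x_j = 0 & 0 & q/w_0 & q & 0 \\
y_j = 1 & 0 & q & q w_0 & 0 \\
y_j = 0 & q & 0 & 0 & q w_1
\end{array} \tag{6}$$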



Here each of the 16 elements corresponds to a block in $Y_\alpha$ with all entries equal to this element. The
first and the second columns represent the elements from $f^{-1}(1)$ that satisfy $\alpha$ and $A^v_j$, and such that
their $j$th element equals 1 and 0, respectively. Similarly, the third and the fourth columns represent
elements from $f^{-1}(0)$ that satisfy $\alpha$ and such that their $j$th element equals 1 and 0, respectively. This
construction is due to Robin Kothari [TBI. 
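
The block structure in (6) can be checked numerically on four representative inputs, one per row of the block. The sketch below is only a sanity check under assumptions made here: the helper names are hypothetical, the numeric values of $q$, $w_0$, $w_1$ are arbitrary positive test values, and each of the four classes of inputs is collapsed to a single representative.

```python
import random
from math import sqrt, isclose

def outer(u, v):
    """Outer product of two real vectors given as lists."""
    return [[a * b for b in v] for a in u]

def block_six(q, w0, w1):
    """q * (psi psi^T + phi phi^T) on representatives ordered as
    (x_j = 1, x_j = 0, y_j = 1, y_j = 0)."""
    psi = [1 / sqrt(w1), 0.0, 0.0, sqrt(w1)]
    phi = [0.0, 1 / sqrt(w0), sqrt(w0), 0.0]
    p, f = outer(psi, psi), outer(phi, phi)
    return [[q * (p[i][j] + f[i][j]) for j in range(4)] for i in range(4)]

q, w0, w1 = 0.25, 0.5, 3.0  # arbitrary positive test values (assumption)
Y = block_six(q, w0, w1)
expected = [
    [q / w1, 0, 0, q],
    [0, q / w0, q, 0],
    [0, q, q * w0, 0],
    [q, 0, 0, q * w1],
]
matches = all(isclose(Y[i][j], expected[i][j], abs_tol=1e-12)
              for i in range(4) for j in range(4))

def quadratic_form(M, z):
    return sum(z[i] * M[i][j] * z[j] for i in range(4) for j in range(4))

# A sum of two rank-one outer products is positive semi-definite;
# spot-check z^T Y z >= 0 on random vectors.
rng = random.Random(0)
psd_samples = all(
    quadratic_form(Y, [rng.uniform(-1, 1) for _ in range(4)]) >= -1e-12
    for _ in range(1000)
)
```

The off-diagonal entries between positive and negative inputs equal $q$ exactly when $x_j \ne y_j$, which is what the feasibility argument relies on.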



5 Compared to the learning graph for 2-distinctness from Section 2.3, we do not have $V_0$ and $V_3$. The reason for the
absence of $V_0$ is described in Footnote 6. Set $V_3$ is omitted because no arc originates there, hence, by the view of Point ^
from Section 3, it is of no importance for us.

6 These arcs may be considered as transitions from $\emptyset$ to the elements of $V_1$. In order to obtain a learning graph
similar to the one in Figure 1, one has to merge all vertices of the form $(\emptyset, R)$ into one vertex forming $V_0$.
4.1 Feasibility



Assume $x$ and $y$ are inputs such that $f(x) = 1$ and $f(y) = 0$. Let $R \in V_1$ be a choice of the internal
randomness consistent with $x$. Let $Z_j$ be the matrix corresponding to the arc loading $j$ that is taken for
the input $x$ and randomness $R$. I.e., $Z_j$ is either the matrix of (5) with sub-index $j$, or $Z_j = 0$, if there
are none, i.e., when $j \notin R \cup \{a, b\}$. We are going to prove that

$$\sum_{j \colon x_j \ne y_j} Z_j[x, y] = q. \tag{7}$$

(This is what we meant by saying in Point (A) of Section 3 that the learning graph doesn't err for all
choices of the internal randomness.) Since there are $\binom{n-2}{r}$ choices of $R$ consistent with $x$, and no arc is
taken for two different choices of the randomness, this proves the feasibility condition in (1b).

Consider the order (4) in which elements are loaded for this particular choice of $x$ and $R$. Before any
element is loaded, both inputs agree (they satisfy the same assignment $\alpha\colon \emptyset \to \{0, 1\}$). After all elements
are loaded, $x$ and $y$ disagree, because it is not possible that $y_a = x_a$ and $y_b = x_b$. With each element
loaded, the assignments become more specific. This means that there exists an element $j = t_i$ such that
$x$ and $y$ agree before loading $j$, but disagree afterwards. In particular, $x_j \ne y_j$. By construction, this $j$
contributes $q$ to the sum in (7). All other $j$ contribute 0 to the sum. Indeed, if $j' = t_{i'}$ with $i' < i$ then
$x_{j'} = y_{j'}$, hence, $j'$ contributes 0. For $j' = t_{i'}$ with $i' > i$, $x$ and $y$ disagree on $\{t_1, \dots, t_{i'-1}\}$, hence,
$Z_{j'}[x, y] = 0$ by construction.
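
The argument above can be replayed exhaustively on a small instance. The sketch below is an illustration under assumptions made here: the example graph (a path on 5 vertices), the choice of the certificate $\{a, b\}$ as the first edge with both endpoints equal to 1, and the shortcut that an arc contributes $q$ to (7) exactly when $x$ and $y$ agree on everything loaded before $j$ and $x_j \ne y_j$ (which is what the blocks in (6) give).

```python
from fractions import Fraction
from itertools import combinations, product
from math import comb

def has_collision(x, edges):
    """Graph collision function: 1 iff some edge has both endpoints set to 1."""
    return any(x[u] == 1 and x[v] == 1 for u, v in edges)

def feasibility_sum(x, y, order, q):
    """Sum of Z_j[x, y] over the loading order: an arc contributes q iff
    x and y agree on all elements loaded before j and x_j != y_j."""
    total = Fraction(0)
    for pos, j in enumerate(order):
        agree_before = all(x[t] == y[t] for t in order[:pos])
        if agree_before and x[j] != y[j]:
            total += q
    return total

n, r = 5, 2
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]  # hypothetical example graph: a path
q = Fraction(1, comb(n - 2, r))
inputs = list(product([0, 1], repeat=n))
positives = [x for x in inputs if has_collision(x, edges)]
negatives = [y for y in inputs if not has_collision(y, edges)]

checked = 0
for x in positives:
    a, b = next((u, v) for u, v in edges if x[u] == 1 and x[v] == 1)
    rest = [t for t in range(n) if t not in (a, b)]
    for y in negatives:
        per_pair_total = Fraction(0)
        for R in combinations(rest, r):      # all R consistent with x
            order = list(R) + [a, b]         # loading order (4)
            s = feasibility_sum(x, y, order, q)
            assert s == q                    # equation (7)
            per_pair_total += s
        assert per_pair_total == 1           # summing over R gives feasibility
        checked += 1
```

Exact rational arithmetic is used so that the equalities are checked without rounding.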



4.2 Complexity 

Similarly to Section 2.3, let us define the complexity of stage $i$ on input $z \in \{0, 1\}^n$ as $\sum_{j \in [n]} X^i_j[z, z]$,
where $X^i_j = \sum_v X^v_j$ with the sum over $v$ such that $A^v_j$ belongs to stage $i$. Also, define the complexity of
stage $i$ as the maximum complexity over all inputs $z \in \{0, 1\}^n$. Clearly, the objective value (1a) of the
whole program is at most the sum of the complexities of all stages.

Let us start with stages II.1 and II.2.$^7$ For any $x \in f^{-1}(1)$, on both stages II.1 and II.2 there are
$\binom{n-2}{r}$ arcs satisfying it. These are arcs $A^R_a$ and $A^{R \cup \{a\}}_b$, respectively, for all choices of $R \in V_1$ consistent
with $x$. By (6), each of them contributes $q/w_1$ to the complexity of $x$ on stages II.1 and II.2, respectively.
Since we are guaranteed that $x_j = 1$ in notations from (6), we may set $w_0 = 0$.

The total numbers of arcs on stages II.1 and II.2 are $(n - r)\binom{n}{r}$ and $(n - r - 1)\binom{n}{r+1}$, respectively. Each
of them contributes at most $q w_1$ to the complexity of any $y \in f^{-1}(0)$ on stages II.1 and II.2, respectively.

Thus, the complexity of each of stages II.1 and II.2 on any $x \in f^{-1}(1)$ is $\binom{n-2}{r} q / w_1 = 1/w_1$. On any
$y \in f^{-1}(0)$, it is at most $(n - r)\binom{n}{r} q w_1 = O(n w_1)$ and $(n - r - 1)\binom{n}{r+1} q w_1 = O(n^2 w_1 / r)$, respectively.
If we set $w_1$ equal to $1/\sqrt{n}$ on stage II.1 and to $\sqrt{r}/n$ on stage II.2, the complexities of these stages
become $O(\sqrt{n})$ and $O(n/\sqrt{r})$, respectively.

Consider stage I now. Let $k$ be the number of variables with value 1 in the input ($x$ or $y$). The total
number of arcs on this stage is $r\binom{n}{r}$. Out of them, exactly $k\binom{n-1}{r-1}$ load a variable with value 1. Thus,
for $y \in f^{-1}(0)$, the complexity of stage I is

$$q r \binom{n}{r} w_0 + q k \binom{n-1}{r-1} w_1 = O\!\left(r w_0 + \frac{kr}{n}\, w_1\right).$$

Similarly, for $x \in f^{-1}(1)$, the complexity of stage I is $O(r/w_1 + kr/(n w_0))$. If we set $w_0 = \sqrt{\alpha/n}$ and
$w_1 = \sqrt{n/\alpha}$ then, since $k \le 2\alpha$, the complexity of stage I becomes $O(r\sqrt{\alpha/n})$. The total complexity of
the learning graph is

$$O\!\left(\frac{n}{\sqrt{r}} + r\sqrt{\frac{\alpha}{n}}\right) = O\!\left(\sqrt{n}\,\alpha^{1/6}\right),$$

if $r = n\alpha^{-1/3}$. □
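
The choice $r = n\alpha^{-1/3}$ balances the two dominant terms $n/\sqrt{r}$ and $r\sqrt{\alpha/n}$. The following sketch checks this numerically, ignoring all constants hidden in the $O$-notation; the function name and the sample values of $n$ and $\alpha$ are assumptions chosen for illustration.

```python
from math import sqrt, isclose

def stage_costs(n, alpha, r):
    """Leading-order stage complexities from Section 4.2, constants dropped."""
    return {
        "II.1": sqrt(n),
        "II.2": n / sqrt(r),
        "I": r * sqrt(alpha / n),
    }

n, alpha = 10**6, 10**3            # hypothetical sample values
r = n * alpha ** (-1 / 3)          # the balancing choice of r
costs = stage_costs(n, alpha, r)
target = sqrt(n) * alpha ** (1 / 6)

# With alpha = n the same choice of r reproduces the n^(2/3) bound.
costs_general = stage_costs(n, n, n * n ** (-1 / 3))
```

Stage II.1 contributes only $\sqrt{n}$, which never exceeds the balanced value $\sqrt{n}\,\alpha^{1/6}$ since $\alpha \ge 1$.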



7 For stages II.1 and II.2, the complexity of the stage can be calculated using Theorem 6 like in Section 2.3. For stage
II.1, the length is 1, and the speciality is $O(n)$. For stage II.2, the length is 1, and the speciality is $O(n^2/r)$. Hence, the
complexities are $O(\sqrt{n})$ and $O(n/\sqrt{r})$, respectively.
5 Algorithm for $k$-distinctness: First attempt



The aim of this and the next sections is to prove the following theorem: 

Theorem 8. For arbitrary but fixed integer $k \ge 2$, the $k$-distinctness problem can be solved by a quantum
computer in $O\!\left(n^{1 - 2^{k-2}/(2^k - 1)}\right)$ queries with a bounded error.

As mentioned in Section 3, we do not rely on previous results like Theorem [BJ in the proof, and use
Theorem 2 directly. The construction of the algorithm deviates from the graph representation: a bit in
Section and quite strongly in Section [5]. However, we keep the term "vertex" for an entity describing
some knowledge of the values of the input variables, and the term "arc" for a process of loading a value
of a variable (possibly, only partially). Each arc originates in a vertex, but we do not specify where it
goes. Inspired by Section 2.3, the vertices are divided into key ones denoted by the set of loaded variables
$S$ with additional structure. The non-key vertices are denoted by $(S, R)$ where $S$ is the set of loaded
variables, and $R$ is an additional label used to distinguish vertices with the same $S$, as described in
Section 2.2. Also, we use the "internal randomness" term from Section 2.5.

Throughout Sections 5 and 6, let $f\colon [m]^n \to \{0, 1\}$ be the $k$-distinctness function. The section is
organized as follows. In Section 5.1, we rigorously define the learning graph from Table 2; in Section 5.2,
analyze its complexity; and, finally, describe the flaw mentioned in Point (B) of Section 3 in Section 5.3.

Similarly to the analysis in Ref. [3], we may assume there is a unique $k$-tuple of equal elements in any
positive input.$^8$ One of the simplest reductions to this special case is to take a sequence $T_i$ of uniformly
random subsets of $[n]$ of sizes $(2k/(2k + 1))^i n$, and to run the algorithm, for each $i$, with the input
variables outside $T_i$ removed. One can prove that if there are $k$ equal elements in the input then there
exists $i$ such that, with probability at least 1/2, $T_i$ will contain a unique $k$-tuple of equal elements. The
complexities of the executions of the algorithm for various $i$ form a geometric series, and their sum is
equal to the complexity of the algorithm for $i = 0$ up to a constant factor. Refer to Ref. [3] for more
detail and alternative reductions.
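
The geometric-series claim can be illustrated numerically. In the sketch below, the cost of a run on a subset of size $n_i$ is modelled as $n_i^c$ with $c$ the exponent from Theorem 8; this cost model, the stopping rule (sizes below $k$ are skipped) and the function names are assumptions made for illustration only.

```python
def exponent(k):
    """Exponent of n in Theorem 8."""
    return 1 - 2 ** (k - 2) / (2 ** k - 1)

def total_cost_ratio(n, k):
    """Return (sum of run costs) / (cost of the run with i = 0), together with
    the geometric-series bound 1 / (1 - rho^c) on this ratio."""
    rho = 2 * k / (2 * k + 1)       # shrink factor of the subset sizes
    c = exponent(k)
    sizes = []
    size = float(n)
    while size >= k:                # assumed stopping rule
        sizes.append(size)
        size *= rho
    total = sum(s ** c for s in sizes)
    return total / n ** c, 1 / (1 - rho ** c)

ratio_3, bound_3 = total_cost_ratio(10**6, 3)
ratio_5, bound_5 = total_cost_ratio(10**6, 5)
```

The bound depends on $k$ only, so for fixed $k$ the overhead of the reduction is a constant factor.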

5.1 Construction 

The construction of the learning graph $\mathcal{G}$ for $k$-distinctness is similar to the one in Theorem 7. Let $x$
be a positive input, and let $M = \{a_1, a_2, \dots, a_k\}$ denote the unique $k$-tuple of equal elements in $x$. The
key vertices of the learning graph are $V_1 \cup \dots \cup V_k$, where $V_s$, for $s \in [k]$, consists of all $(k - 1)$-tuples
$S = (S_1, \dots, S_{k-1})$ of pairwise disjoint subsets of $[n]$ of the following sizes. For $V_s$, we require that
$|S_i| = r_i + 1$ for $i < s$, and $|S_i| = r_i$ for $i \ge s$.

Again, a vertex $R = (R_1, \dots, R_{k-1}) \in V_1$ completely specifies the internal randomness. We assume
that, for any $R \in V_1$, an arbitrary order $t_1, \dots, t_r$ of the elements in $\bigcup R = R_1 \cup \dots \cup R_{k-1}$ is fixed so
that all elements of $R_i$ precede all elements of $R_{i+1}$ for all $i \le k - 2$. (Here $r = \sum_i r_i$.) We say $R \in V_1$
is consistent with $x$ if $\{a_1, \dots, a_k\} \cap (\bigcup R) = \emptyset$.

For each $x \in f^{-1}(1)$, there are exactly $\binom{n-k}{r_1, \dots, r_{k-1}}$ choices of $R \in V_1$ consistent with $x$. We take each
of them, in the sense of Section 2.3, with probability $q = \binom{n-k}{r_1, \dots, r_{k-1}}^{-1}$. Here we use notation

$$\binom{N}{b_1, \dots, b_j} = \binom{N}{b_1}\binom{N - b_1}{b_2} \cdots \binom{N - b_1 - \dots - b_{j-1}}{b_j}.$$
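
The notation above counts the ways to pick pairwise disjoint subsets of sizes $b_1, \dots, b_j$ out of $N$ elements, leaving the remaining elements unused. A small sketch of this count follows; the function name and the sample parameters are assumptions made here.

```python
from fractions import Fraction
from math import comb, factorial

def multi_binom(N, parts):
    """Number of ways to choose disjoint subsets of sizes parts[0], parts[1], ...
    from N elements: C(N, b1) * C(N - b1, b2) * ..."""
    result, remaining = 1, N
    for b in parts:
        result *= comb(remaining, b)
        remaining -= b
    return result

# Cross-check against N! / (b_1! ... b_j! (N - b_1 - ... - b_j)!).
closed_form = factorial(10) // (factorial(2) * factorial(3) * factorial(5))

# Hypothetical parameters n = 10, k = 3, (r_1, r_2) = (2, 3): the probability q.
q = Fraction(1, multi_binom(10 - 3, [2, 3]))
```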

For a fixed input $x$ and fixed randomness $R \in V_1$ consistent with $x$, the elements are loaded in the
following order:

$$t_1,\; t_2,\; \dots,\; t_r,\; t_{r+1} = a_1,\; t_{r+2} = a_2,\; \dots,\; t_{r+k} = a_k. \tag{8}$$

We use a similar convention to name the vertices and the arcs of the learning graph as in Theorem 7.
The non-key vertices of $\mathcal{G}$ are of the form $v = (R \cap \{t_1, \dots, t_\ell\}, R)$, where $R \in V_1$, $0 \le \ell < r$, and $\{t_i\}$ are
from (8). Here we use notation $R \cap T = (R_1 \cap T, \dots, R_{k-1} \cap T)$. The first element of the pair describes
the set of loaded elements.

8 Actually, this is an overkill: as we will see from the proof, it is enough for our algorithm to assume there are at most
$O(n)$ pairs of equal elements in the input, which is a weaker assumption.
Let us describe the arcs $A^v_j$ of $\mathcal{G}$, where, again, $j$ is the variable the arc loads, and $v$ is the vertex of
$\mathcal{G}$ it originates in. The arcs of the stages I.$s$ have $v = (R \cap \{t_1, \dots, t_\ell\}, R)$ and $j = t_{\ell+1}$ with $0 \le \ell < r$.
The arc belongs to stage I.$s$ iff $t_{\ell+1} \in R_s$. The arcs of stage II.$s$ have $v = S$, with $S \in V_s$, and $j \notin \bigcup S$.

For a fixed $x \in f^{-1}(1)$ and fixed internal randomness $R \in V_1$ consistent with $x$, the following arcs
are taken:

$$A^{(R \cap \{t_1, \dots, t_\ell\}, R)}_{t_{\ell+1}} \ \text{for } 0 \le \ell < r, \qquad \text{and} \qquad A^{R[a_1, \dots, a_\ell]}_{a_{\ell+1}} \ \text{with } 0 \le \ell < k. \tag{9}$$

Here

$$R[a_1, a_2, \dots, a_\ell] = (R_1 \cup \{a_1\}, R_2 \cup \{a_2\}, \dots, R_\ell \cup \{a_\ell\}, R_{\ell+1}, \dots, R_{k-1}).$$



We say $x$ satisfies all these arcs. Note that, for a fixed $x$, no arc is taken for two different choices of $R$.

Again, for each arc $A^v_j$, we assign a matrix $X^v_j \succeq 0$, so that $X_j$ in (1) are given by $X_j = \sum_v X^v_j$.
Assume $A^v_j$ is fixed. Let $S = (S_1, \dots, S_{k-1})$ be the set of loaded elements. Define an assignment on $S$ as
a function $\alpha\colon \bigcup S \to [m] \cup \{\star\}$, where $\star$ represents the covered elements of stages I.$s$ for $s > 1$. Thus,
$\alpha$ must satisfy $\star \notin \alpha(S_1)$ and $\alpha(S_{i+1}) \subseteq \alpha(S_i) \cup \{\star\}$ for $1 \le i \le k - 2$. An input $z \in [m]^n$ satisfies
assignment $\alpha$ iff, for each $t \in \bigcup S$,

$$\alpha(t) = \begin{cases} z_t, & t \in S_1;\ \text{or } t \in S_i \text{ for } i > 1 \text{ and } z_t \in \alpha(S_{i-1}); \\ \star, & \text{otherwise.} \end{cases}$$



Each input z satisfies unique assignment on S. Again, we say inputs x and y agree on S, if they satisfy 
the same assignment on S. 
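
The definition can be read as a level-by-level rule: an element of $S_i$ with $i > 1$ keeps its value only if that value already appears among the uncovered values of $S_{i-1}$. The sketch below computes the unique assignment an input satisfies; the encoding (0-based indices, a list of sets for $S$, the string `"*"` for $\star$) and the sample inputs are assumptions chosen for illustration.

```python
STAR = "*"  # stands for the covered symbol

def assignment(z, S):
    """Return the assignment (dict: index -> value or STAR) that input z satisfies
    on S = [S_1, ..., S_{k-1}], a list of pairwise disjoint sets of indices."""
    alpha = {}
    prev_values = set()
    for level, S_i in enumerate(S):
        current_values = set()
        for t in S_i:
            # Elements of S_1 are always uncovered; deeper elements only if
            # their value matches an uncovered value one level up.
            if level == 0 or z[t] in prev_values:
                alpha[t] = z[t]
                current_values.add(z[t])
            else:
                alpha[t] = STAR
        prev_values = current_values
    return alpha

def agree(x, y, S):
    """Inputs agree on S iff they satisfy the same assignment."""
    return assignment(x, S) == assignment(y, S)

# Hypothetical example with k = 4, so S has three levels.
S = [{0, 1}, {2, 4}, {5}]
x = [5, 7, 5, 7, 9, 5]
y = [5, 7, 5, 7, 8, 5]   # differs from x only at the covered index 4
z = [5, 7, 6, 7, 9, 5]   # index 2 becomes covered, which also covers index 5
alpha_x = assignment(x, S)
```

In this example `x` and `y` agree on `S` although they differ as strings, because the differing position is covered in both.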

We define $X^v_j$ as $\sum_\alpha Y_\alpha$ where the sum is over all assignments $\alpha$ on $S$. The definition of $Y_\alpha$ depends
on whether $A^v_j$ is on stage I.$s$ with $s > 1$, or not. If $A^v_j$ is not on one of these stages then $Y_\alpha = q\psi\psi^*$
where, for each $z \in [m]^n$,

$$\psi[z] = \begin{cases} 1/\sqrt{w}, & f(z) = 1,\ \text{and } z \text{ satisfies } \alpha \text{ and the arc } A^v_j; \\ \sqrt{w}, & f(z) = 0,\ \text{and } z \text{ satisfies } \alpha; \\ 0, & \text{otherwise.} \end{cases}$$

Here $w$ is a positive real number: the weight of the arc. It only depends on the stage of the arc, and will
be specified later. Thus, $X^v_j$ consists of the blocks of the following form:

$$\begin{array}{c|cc}
 & x & y \\ \hline
x & q/w & q \\
y & q & qw
\end{array} \tag{10}$$

Here $x$ and $y$ represent inputs mapping to 1 and 0, respectively, all satisfying some assignment $\alpha$. The
inputs represented by $x$ have to satisfy the arc $A^v_j$ as well.

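If $A^v_j$ is on stage I.$s$ with $s > 1$, the elements having a match in $S_{s-1}$ and the ones that don't must
be treated differently. In this case, $Y_\alpha = q(\psi\psi^* + \phi\phi^*)$ where

$$\psi[z] = \begin{cases} 1/\sqrt{w_1}, & f(z) = 1,\ z_j \in \alpha(S_{s-1}),\ \text{and } z \text{ satisfies } \alpha \text{ and } A^v_j; \\ \sqrt{w_1}, & f(z) = 0,\ \text{and } z \text{ satisfies } \alpha; \\ 0, & \text{otherwise;} \end{cases}$$

$$\phi[z] = \begin{cases} 1/\sqrt{w_0}, & f(z) = 1,\ z_j \notin \alpha(S_{s-1}),\ \text{and } z \text{ satisfies } \alpha \text{ and } A^v_j; \\ \sqrt{w_0}, & f(z) = 0,\ z_j \in \alpha(S_{s-1}),\ \text{and } z \text{ satisfies } \alpha; \\ 0, & \text{otherwise.} \end{cases}$$

Here $w_0$ and $w_1$ are again parameters to be specified later. In other words, $X^v_j$ consists of the blocks of
the following form:

$$\begin{array}{c|cccc}
 & x_j \in \alpha(S_{s-1}) & x_j \notin \alpha(S_{s-1}) & y_j \in \alpha(S_{s-1}) & y_j \notin \alpha(S_{s-1}) \\ \hline
x_j \in \alpha(S_{s-1}) & q/w_1 & 0 & q & q \\
x_j \notin \alpha(S_{s-1}) & 0 & q/w_0 & q & 0 \\
y_j \in \alpha(S_{s-1}) & q & q & q(w_0 + w_1) & q w_1 \\
y_j \notin \alpha(S_{s-1}) & q & 0 & q w_1 & q w_1
\end{array} \tag{11}$$

Here $x$ and $y$ are like in (10). This is a generalization of the construction from Theorem 7. Note that if
$x_j$ and $y_j$ are both represented by $\star$ in the assignments on $(S_1, \dots, S_{s-1}, S_s \cup \{j\}, S_{s+1}, \dots, S_{k-1})$ they
satisfy then $X^v_j[x, y] = 0$.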
5.2 Complexity

Let us estimate the complexity of the learning graph. We use the notion of the complexity of a stage
from Section 4.2.

Let us start with stage I.1. We set $w = 1$ for all arcs on this stage. There are $r_1\binom{n}{r_1, \dots, r_{k-1}}$ arcs on
this stage, and, by (10), each of them contributes at most $q$ to the complexity of each $z \in [m]^n$. Hence,
the complexity of stage I.1 is $O\!\left(q r_1 \binom{n}{r_1, \dots, r_{k-1}}\right) = O(r_1)$.

Now consider stage II.$s$ for $s \in [k]$.$^9$ The total number of arcs on the stage is
$(n - r - s + 1)\binom{n}{r_1 + 1, \dots, r_{s-1} + 1, r_s, \dots, r_{k-1}}$. By (10), each of them contributes $qw$ to the complexity of each $y \in f^{-1}(0)$.
Out of these arcs, for any $x \in f^{-1}(1)$, exactly $\binom{n-k}{r_1, \dots, r_{k-1}}$ satisfy $x$. And each of them contributes $q/w$ to
the complexity of $x$. Thus, the complexities of stage II.$s$ for any input in $f^{-1}(0)$ and $f^{-1}(1)$ are

$$(n - r - s + 1)\binom{n}{r_1 + 1, \dots, r_{s-1} + 1, r_s, \dots, r_{k-1}}\, q w = O\!\left(\frac{n^s}{r_1 \cdots r_{s-1}}\, w\right) \qquad \text{and} \qquad \binom{n-k}{r_1, \dots, r_{k-1}} \frac{q}{w} = \frac{1}{w},$$

respectively. By setting $w = (n^s/(r_1 \cdots r_{s-1}))^{-1/2}$, we get complexity $O\!\left(\sqrt{n^s/(r_1 \cdots r_{s-1})}\right)$ of stage
II.$s$. The maximal complexity is attained for stage II.$k$.

Now let us calculate the complexity of stage I.$s$ for $s > 1$. The total number of arcs on this stage is
$r_s\binom{n}{r_1, \dots, r_{k-1}}$. Consider an input $z \in [m]^n$, and a choice of the internal randomness $R = (R_1, \dots, R_{k-1}) \in V_1$.
An element $j$ is uncovered on stage I.$s$ for this choice of $R$ if and only if there is an $s$-tuple $(b_1, \dots, b_s)$
of elements with $j = b_s$ such that $b_i \in R_i$ and $z_{b_i} = z_{b_j}$ for all $i, j \in [s]$. By our assumption on the
uniqueness of a $k$-tuple of equal elements in a positive input, the total number of such $s$-tuples is $O(n)$.
And, for each of them, there are $\binom{n-s}{r_1 - 1, \dots, r_s - 1, r_{s+1}, \dots, r_{k-1}}$ choices of $R \in V_1$ such that $b_i \in R_i$ for all
$i \in [s]$. By (11), the complexities of this stage for an input in $f^{-1}(0)$ and in $f^{-1}(1)$ are, respectively, at
most

$$q\left[O(n)\binom{n-s}{r_1 - 1, \dots, r_s - 1, r_{s+1}, \dots, r_{k-1}} w_0 + r_s\binom{n}{r_1, \dots, r_{k-1}} w_1\right] = O\!\left(\frac{r_1 \cdots r_s}{n^{s-1}}\, w_0 + r_s w_1\right)$$

and

$$q\left[O(n)\binom{n-s}{r_1 - 1, \dots, r_s - 1, r_{s+1}, \dots, r_{k-1}} \frac{1}{w_1} + r_s\binom{n}{r_1, \dots, r_{k-1}} \frac{1}{w_0}\right] = O\!\left(\frac{r_1 \cdots r_s}{n^{s-1}} \cdot \frac{1}{w_1} + \frac{r_s}{w_0}\right).$$

By assigning $w_0 = \sqrt{n^{s-1}/(r_1 \cdots r_{s-1})}$ and $w_1 = \sqrt{r_1 \cdots r_{s-1}/n^{s-1}}$, both these quantities become
$O\!\left(r_s\sqrt{r_1 \cdots r_{s-1}/n^{s-1}}\right)$.

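With this choice of the weights, the value of the objective function in (1a) is

$$O\!\left(r_1 + r_2\sqrt{\frac{r_1}{n}} + \dots + r_{k-1}\sqrt{\frac{r_1 \cdots r_{k-2}}{n^{k-2}}} + \sqrt{\frac{n^k}{r_1 \cdots r_{k-1}}}\right). \tag{12}$$

Assuming all terms in (12) except the last one are equal, and denoting $\rho_i = \log_n r_i$, we get that

$$\rho_i + \frac{\rho_1 + \dots + \rho_{i-1}}{2} - \frac{i - 1}{2} = \rho_{i+1} + \frac{\rho_1 + \dots + \rho_i}{2} - \frac{i}{2}, \qquad \text{for } i = 1, \dots, k - 2;$$

or, equivalently,

$$\rho_{i+1} = \frac{1 + \rho_i}{2}, \qquad \text{for } i = 1, \dots, k - 2.$$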



9 The complexities of stages I.1 and II.$s$ can be explained by a similar argument like in Section 2.3. For stage I.1, the
length is $r_1$, and the speciality is $O(1)$. For stage II.$s$, the length is 1, but the speciality is $O(n^s/(r_1 \cdots r_{s-1}))$, because
there are $s$ marked elements involved, giving $O(n^s)$, but $a_i$, for $i < s$, is hidden in $S_i$ of size $r_i$, hence, the speciality gets
divided by $r_1 \cdots r_{s-1}$. This argument works, because the "flow" is symmetric (the $(x, y)$-entries of $X^v_j$ are either 0 or $q$)
as highlighted in Section 3.
Assuming the first term, $r_1$, equals the last one, $\sqrt{n^k/(r_1 \cdots r_{k-1})}$, we get

$$\rho_1 = \frac{1 + (1 - \rho_1) + \dots + (1 - \rho_{k-1})}{2} = \frac{1}{2} + \left(\frac{1}{2} + \frac{1}{4} + \dots + \frac{1}{2^{k-1}}\right)(1 - \rho_1).$$

From here, it is straightforward that $\rho_1 = 1 - 2^{k-2}/(2^k - 1)$, hence, the complexity of the algorithm is
$O\!\left(n^{1 - 2^{k-2}/(2^k - 1)}\right)$.
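
The balancing computation can be verified with exact rational arithmetic. The sketch below starts from $\rho_1 = 1 - 2^{k-2}/(2^k - 1)$, applies the recurrence $\rho_{i+1} = (1 + \rho_i)/2$, and computes the exponent of $n$ in every term of (12); the function names are assumptions made here.

```python
from fractions import Fraction

def rho_values(k):
    """rho_1, ..., rho_{k-1} from the closed form for rho_1 and the recurrence."""
    rhos = [1 - Fraction(2 ** (k - 2), 2 ** k - 1)]
    for _ in range(k - 2):
        rhos.append((1 + rhos[-1]) / 2)
    return rhos

def term_exponents(k):
    """Exponents of n in the terms of (12): the stage I.s terms, then the last term."""
    rhos = rho_values(k)
    stage_one = [
        rhos[i] + sum(rhos[:i], Fraction(0)) / 2 - Fraction(i, 2)
        for i in range(k - 1)
    ]
    last = Fraction(k, 2) - sum(rhos, Fraction(0)) / 2
    return stage_one, last

stage_one_3, last_3 = term_exponents(3)
all_balanced = all(
    set(term_exponents(k)[0]) | {term_exponents(k)[1]} == {rho_values(k)[0]}
    for k in range(2, 9)
)
```

For $k = 2$ the common exponent is $2/3$, matching element distinctness, and for $k = 3$ it is $5/7$.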



5.3 (In)feasibility

Assume $x$ and $y$ are inputs such that $f(x) = 1$ and $f(y) = 0$. Let $R = (R_1, \dots, R_{k-1}) \in V_1$ be a choice
of the internal randomness consistent with $x$. Similarly to the proof of Theorem 3, let $Z_j$ be the matrix
corresponding to the arc loading $j$ that is taken for input $x$ and randomness $R$ (i.e., the one from (9)
with sub-index $j$, or the zero matrix, if there are none).

Again, we would like to prove that (7) holds. Unfortunately, it doesn't always hold. Assume $x$, $y$ and
$R \in V_1$ are such that $x$ and $y$ agree on $R$. Thus, the contribution to (7) is 0 from all arcs of stages I.$s$.
Now assume that $x_{a_1} = y_{a_1}$ and there exists $b \in R_2$ such that $y_b = x_{a_1}$. This doesn't contradict that $x$
and $y$ agree on $R$, because $y_b$ is represented by $\star$ in the assignment it satisfies on $R$.

But $x$ and $y$ disagree on $R[a_1]$, because $y_b$ gets uncovered there. Thus, the contribution to (7) is 0
from all arcs of stages II.$s$ as well. Thus, equation (7) doesn't hold. We deal with this problem in the
next section.

6 Final version 

In Section 5.3, we saw that the learning graph in Table 2 is incorrect. This is due to faults. A fault is an
element $b$ of $R_i$ with $i > 1$ such that $y_b = x_{a_1}$. This is the only element that can suddenly uncover itself
when adding $a_{i-1}$ to $R_{i-1}$ on stage II.$(i - 1)$, because we have assumed $x$ contains a unique $k$-tuple of
equal elements, hence, if $R \in V_1$ is consistent with $x$, no $b$ in $\bigcup R$ satisfies $x_b = x_{a_1}$.$^{10}$

But since $y$ is a negative input, there are at most $k - 1 = O(1)$ faults for every choice of $x$. Thus,
all we need is to develop a fault-tolerant version of the learning graph from Table 2 that is capable of
dealing with this number of faults.

As an introductory example, consider case $k = 3$. In this case, a fault may only occur in $R_2$. A fault
may come in action only if $y_{a_1} = x_{a_1}$, hence, we may assume there are at most $k - 2$ faults in any $y$.
Split $R_2$ into $k - 1$ subsets $\{R_2(d)\}_{d \in [k-1]}$. We know that at least one of them is not faulty, but it is not
enough: we have to assure the contribution from these arcs is $q$ exactly, no matter how many of $R_2(d)$
are faulty, i.e., a variant of (7). We achieve this by splitting $R_1$ into $2^{k-1} - 1$ parts $\{R_1(D)\}$ labeled
by non-empty subsets $D$ of $[k - 1]$. We uncover an element in $R_2(d)$ if and only if it has a match in
$R_1(D)$ for some $D \ni d$. By adding $a_1$ to $R_1(D)$, we can test whether $\bigcup_{d \in D} R_2(d)$ contains a fault. This
is enough to guarantee (7) by an application of the inclusion-exclusion principle. The construction in
Section 6.1 is a generalization of this idea for arbitrary $k$.
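
The inclusion-exclusion step rests on the identity $\sum_{\emptyset \ne D \subseteq I} (-1)^{|D|+1} = 1$ for every non-empty set $I$. In the example with $k = 3$, $I$ plays the role of the set of labels $d$ for which $R_2(d)$ is fault-free, and the sign $(-1)^{|D|+1}$ is the one the parity rule of Section 6.1 gives to the arc loading $a_2$ after $a_1$ was added to $R_1(D)$. The sketch below checks the identity by enumeration; the function name is an assumption made here.

```python
from itertools import combinations

def signed_sum(I):
    """Sum of (-1)^(|D| + 1) over all non-empty subsets D of I."""
    items = sorted(I)
    return sum(
        (-1) ** (size + 1)
        for size in range(1, len(items) + 1)
        for _ in combinations(items, size)
    )

# The signed count is 1 for every non-empty I, regardless of how many
# of the parts are fault-free; it is 0 only if no part is fault-free.
values = [signed_sum(set(range(1, m + 1))) for m in range(0, 7)]
```

Multiplying by $q$, the arcs whose label set $D$ avoids all faulty parts contribute $q$ in total, independently of the number of faults, as long as at least one part is fault-free.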

6.1 Construction 

The key vertices of the learning graph are $V_1 \cup \dots \cup V_k$, where $V_s$ consists of all collections of pairwise
disjoint subsets $S = (S_i(d_1, d_2, \dots, d_{i-1}, D))$ labeled by $i \in [k - 1]$, $d_j \in [k - j]$, and $\emptyset \subset D \subseteq [k - i]$.
There are additional requirements on the sizes of these subsets.

For a non-empty subset $D \subset \mathbb{N}$, let $\mu(D)$ denote the minimal element of $D$. (Actually, any fixed
element of $D$ works as well.) For each sequence $(D_1, \dots, D_{s-1})$, where $D_i$ is a non-empty subset of $[k - i]$,
let $V_s(D_1, \dots, D_{s-1})$ consist of all collections $(S_i(d_1, \dots, d_{i-1}, D))$ such that

$$|S_i(d_1, \dots, d_{i-1}, D)| = \begin{cases} r_i + 1, & i < s,\ d_1 = \mu(D_1), \dots, d_{i-1} = \mu(D_{i-1}),\ \text{and } D = D_i; \\ r_i, & \text{otherwise.} \end{cases}$$

10 In fact, this is not a problem even without this assumption. We may adjudge that elements in $x$ equal to $x_{a_1}$ are
represented by $\star$ in the assignments. This justifies Footnote 8.
Finally, let $V_s$ be the union of $V_s(D_1, \dots, D_{s-1})$ over all choices of $(D_1, \dots, D_{s-1})$.

Again, a vertex $R = (R_i(d_1, d_2, \dots, d_{i-1}, D)) \in V_1$ completely specifies the internal randomness.
For each of them, we fix an arbitrary order $t_1, \dots, t_r$ of elements in $\bigcup R$ so that all elements of $R_i$ precede
all elements of $R_{i+1}$ for all $i \le k - 2$. We say $R$ is consistent with $x$, if $\{a_1, \dots, a_k\}$ is disjoint from $\bigcup R$.
Let $q$ be the inverse of the number of $R \in V_1$ consistent with $x$. (Clearly, this number is the same for all
choices of $x$.)

The elements still are loaded in the order from (8). We use a similar convention to name the arcs of the
learning graph as in Section 5. Arcs of stages I.$s$ are of the form $A^{(R \cap \{t_1, \dots, t_\ell\}, R)}_{t_{\ell+1}}$ for $R \in V_1$ and $0 \le \ell < r$.
Here, $R \cap T = (S_i(d_1, d_2, \dots, d_{i-1}, D))$ is defined by $S_i(d_1, d_2, \dots, d_{i-1}, D) = R_i(d_1, d_2, \dots, d_{i-1}, D) \cap T$.
Arcs of stage II.$s$ are of the form $A^R_j$ with $R \in V_s$ and $j \notin \bigcup R$.

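For any $x \in f^{-1}(1)$ and $R \in V_1$ consistent with $x$, the following arcs are taken. On stage I.$s$, for
$s \in [k - 1]$, these are arcs $A^{(R \cap \{t_1, \dots, t_\ell\}, R)}_{t_{\ell+1}}$ where $t_{\ell+1}$ belongs to one of the subsets $R_s(d_1, \dots, d_{s-1}, D)$. On stage II.$s$, for $s \in [k]$,
we have many arcs loading $a_s$. For each choice of $(D_i)_{i \in [s-1]}$ where $D_i$ is a non-empty subset of $[k - i]$,
the arc $A^{R[D_1 \leftarrow a_1, \dots, D_{s-1} \leftarrow a_{s-1}]}_{a_s}$ is taken, where $R[D_1 \leftarrow a_1, \dots, D_{s-1} \leftarrow a_{s-1}] = (S_i(d_1, d_2, \dots, d_{i-1}, D))$
is defined as follows:

$$S_i(d_1, \dots, d_{i-1}, D) = \begin{cases} R_i(d_1, \dots, d_{i-1}, D) \cup \{a_i\}, & i < s,\ d_1 = \mu(D_1), \dots, d_{i-1} = \mu(D_{i-1}),\ \text{and } D = D_i; \\ R_i(d_1, \dots, d_{i-1}, D), & \text{otherwise.} \end{cases}$$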



Again, for each arc $A^v_j$, we define a positive semi-definite matrix $X^v_j$ so that $X_j$ in (1) are given by
$\sum_v X^v_j$. Fix an arc $A^v_j$ and let $S = (S_i(d_1, d_2, \dots, d_{i-1}, D))$ be the set of loaded elements. This time,
we define an assignment on $S$ as a function $\alpha\colon \bigcup S \to [m] \cup \{\star\}$ such that $\star \notin \bigcup_D \alpha(S_1(D))$, and, for
all $i > 1$ and all possible choices of $d_1, \dots, d_{i-1}$ and $D$:

$$\alpha(S_i(d_1, d_2, \dots, d_{i-1}, D)) \subseteq \{\star\} \cup \bigcup_{K \ni d_{i-1}} \alpha(S_{i-1}(d_1, \dots, d_{i-2}, K)).$$

An input $z \in [m]^n$ satisfies assignment $\alpha$ iff, for each $t \in \bigcup S$,

$$\alpha(t) = \begin{cases} z_t, & t \in S_1(D) \text{ for some } D; \\ z_t, & t \in S_i(d_1, \dots, d_{i-1}, D) \text{ and } z_t \in \bigcup_{K \ni d_{i-1}} \alpha(S_{i-1}(d_1, \dots, d_{i-2}, K)); \\ \star, & \text{otherwise.} \end{cases}$$

Again, we say inputs $x$ and $y$ agree on $S$, if they satisfy the same assignment $\alpha$. An example of this
construction may be found in Figure 2.

Like before, we define $X^v_j$ as $\sum_\alpha Y_\alpha$ where the sum is over all assignments $\alpha$ on $S$. For the arcs on
stage I.1, $Y_\alpha$ are defined as in (10), and the arcs on stage I.$s$, for $s > 1$, are defined as in (11) with
$\alpha(S_{s-1})$ replaced by $\bigcup_{K \ni d_{s-1}} \alpha(S_{s-1}(d_1, \dots, d_{s-2}, K))$.

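Now consider stage II.$s$. Let $A^S_j$ be an arc with $S \in V_s(D_1, \dots, D_{s-1})$. In this case, $Y_\alpha = q\psi\psi^*$ where

$$\psi[z] = \begin{cases} 1/\sqrt{w}, & f(z) = 1,\ \text{and } z \text{ satisfies } \alpha \text{ and the arc } A^S_j; \\ \sqrt{w}, & f(z) = 0,\ z \text{ satisfies } \alpha,\ \text{and } s + |D_1| + \dots + |D_{s-1}| \text{ is odd}; \\ -\sqrt{w}, & f(z) = 0,\ z \text{ satisfies } \alpha,\ \text{and } s + |D_1| + \dots + |D_{s-1}| \text{ is even}; \\ 0, & \text{otherwise.} \end{cases}$$

Thus, depending on the parity of $s + |D_1| + \dots + |D_{s-1}|$, $X^S_j$ consists of the blocks of one of the following
two types:

$$\begin{array}{c|cc}
 & x & y \\ \hline
x & q/w & q \\
y & q & qw
\end{array}
\qquad \text{or} \qquad
\begin{array}{c|cc}
 & x & y \\ \hline
x & q/w & -q \\
y & -q & qw
\end{array} \tag{13}$$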



Complexity. Before we go on proving the correctness of this modified learning graph, let us consider
the complexity issue. The complexity analysis follows the same lines as in Section 5.2. The complexity
of stages I.$s$ is proved similarly, by taking $R_i = \bigcup_{d_1, \dots, d_{i-1}, D} R_i(d_1, \dots, d_{i-1}, D)$, and noting that $|R_i| =
O(k!)\, r_i = O(r_i)$. Of course, having a match in $R_{i-1}$ is not sufficient for an element in $R_i$ to be uncovered,
but this only reduces the complexity. The analysis of stage II.$s$ is also similar, but this time instead of
one arc loading element $a_s$ for a fixed choice of $x$ and $R \in V_1$, there are $2^{O(k)} = O(1)$ of them.
Figure 2: A structure of a vertex of a learning graph for 4-distinctness. The vertex belongs to
$V_3(\{2, 3\}, \{1, 2\})$. If there is an arrow between two subsets, a match in the first one is enough to uncover
an element in the second one. After $a_1$ is added to $S_1(\{2, 3\})$ and $a_2$ is added to $S_2(2, \{1, 2\})$, $x$ and $y$
disagree if there is a fault in one of the hatched subsets.



6.2 Feasibility 

Fix inputs $x \in f^{-1}(1)$ and $y \in f^{-1}(0)$, and let $R \in V_1$ be a choice of the internal randomness consistent
with $x$. Compared to the learning graph in Section 5, for a fixed $j \in [n]$, many arcs of the form $A^v_j$ may
be taken, thus, we have to modify the $Z_j$ notation. Let $Z$ be the set of arcs taken for this choice of $x$
and $R$. The complete list is in Section 6.1. We prove that

$$\sum_{A^v_j \in Z \colon x_j \ne y_j} X^v_j[x, y] = q. \tag{14}$$

Since, again, no arc is taken for two different choices of $R \in V_1$, this proves feasibility (1b).

If $x$ and $y$ disagree on $R$ then (14) holds. The reason is similar to the proof of Theorem 7. Again, it
is not hard to check that there exists $i \in [r]$ such that $x$ and $y$ disagree on $R \cap \{t_1, \dots, t_{i'}\}$ if and only if
$i' \ge i$. Let $j = t_i$, $T = \{t_1, \dots, t_{i-1}\}$, $S = R \cap T$ and $S' = R \cap (T \cup \{j\})$. We claim that $X^{(S, R)}_j[x, y] = q$
and $x_j \ne y_j$.

Indeed, let $\alpha$ be the assignment $x$ and $y$ both satisfy on $S$, and let $\alpha_x$ and $\alpha_y$ be the assignments
$x$ and $y$, respectively, satisfy on $S'$. By the order imposed on the elements in (8), we get that $\alpha(t) =
\alpha_x(t) = \alpha_y(t)$ for all $t \in T$. Since $x$ and $y$ disagree on $S'$, it must hold that $\alpha_x(j) \ne \alpha_y(j)$. Hence,
$x_j \ne y_j$, and at least one of them is not represented by $\star$ in the assignment on $S'$. Thus, $X^{(S, R)}_j[x, y] = q$
by (10) or (11), in dependence on whether $A^{(S, R)}_j$ belongs to stage I.1 or not.

We claim the contribution to the sum in (14) from the arcs in $Z$ loading $t_{i'}$ for $i' \in [r + k] \setminus \{i\}$ is
zero. For $i' > i$, this follows from that $x$ and $y$ disagree before loading $t_{i'}$. Now consider $i' < i$. Inputs
$x$ and $y$ agree on $S = R \cap \{t_1, \dots, t_{i'}\}$. Let $j' = t_{i'}$ and $\alpha$ be the assignment $x$ and $y$ both satisfy on $S$.
We have either $x_{j'} = y_{j'}$, or they both are represented by $\star$ in $\alpha$. In both cases, the contribution is zero
(in the second case, by (11)).

Now assume $x$ and $y$ agree on $R$. The contribution to (14) from the arcs of stages I.$s$ is 0 by the
same argument as in the previous paragraph. Let $s$ be the first element such that $x_{a_s} \ne y_{a_s}$. We claim
that if $s' \ne s$, the contribution to (14) from the arcs $A^S_{a_{s'}} \in Z$ with $S \in V_{s'}$ is 0.

Indeed, if $s' < s$ then $x_{a_{s'}} = y_{a_{s'}}$. If $s' > s$, for each choice of $(D_i)_{i \in [s'-1]}$, $x$ and $y$ disagree on
$R[D_1 \leftarrow a_1, \dots, D_{s'-1} \leftarrow a_{s'-1}]$, because, by construction, all $a_i$ with $i < s'$ are uncovered in the
assignment of $x$.





The total contribution from the arcs $A^S_{a_s} \in Z$ with $S \in V_s$ is $q$. This is a special case of Lemma 9
below. Before stating the lemma we have to introduce additional notations. For a vertex $S = R[D_1 \leftarrow
a_1, \dots, D_\ell \leftarrow a_\ell]$ of the learning graph with $\ell < s$, let the block on this vertex be defined as the set of
vertices

$$B(S) = \{ R[D_1 \leftarrow a_1, \dots, D_{s-1} \leftarrow a_{s-1}] \mid \emptyset \subset D_i \subseteq [k - i] \text{ for } i = \ell + 1, \dots, s - 1 \}.$$

Also, define the contribution of the block on this vertex as $C(S) = \sum_{S' \in B(S)} X^{S'}_{a_s}[x, y]$. We prove the
following lemma by induction on $s - \ell$:

Lemma 9. Let $R$ and $s$ be as above. If $x$ and $y$ agree on $S = R[D_1 \leftarrow a_1, \dots, D_\ell \leftarrow a_\ell]$ then the
contribution from the block on $S$ is $(-1)^{\ell + |D_1| + \dots + |D_\ell|} q$. Otherwise, it is 0.

Note that if $\ell = 0$, the lemma states that the contribution of the block on $R$ is $q$. But this block
consists of all arcs of the form $A^S_{a_s}$ from $Z$. Thus, this proves (14).

Proof of Lemma 9. If $x$ and $y$ disagree on $S$, they disagree on any vertex from the block, hence, the
contribution is 0.

Now assume $x$ and $y$ agree on $S$. If $\ell = s - 1$, there is only $S$ in the block. Hence, the contribution is
$(-1)^{\ell + |D_1| + \dots + |D_\ell|} q$ by (13), because $x$ and $y$ agree on $S$ and $x_{a_s} \ne y_{a_s}$. Now assume $\ell < s - 1$, and the
lemma holds for $\ell$ replaced by $\ell + 1$. The block $B(S)$ can be expressed as the following disjoint union:

$$B(S) = \bigsqcup_{\emptyset \subset D_{\ell+1} \subseteq [k - \ell - 1]} B(R[D_1 \leftarrow a_1, \dots, D_\ell \leftarrow a_\ell, D_{\ell+1} \leftarrow a_{\ell+1}]).$$

Let $I$ be the set of $i \in [k - \ell - 1]$ such that $\bigcup_D R_{\ell+2}(\mu(D_1), \dots, \mu(D_\ell), i, D)$ does not contain a fault.
It is not hard to see that $x$ and $y$ agree on $R[D_1 \leftarrow a_1, \dots, D_{\ell+1} \leftarrow a_{\ell+1}]$ if and only if $D_{\ell+1} \subseteq I$.
Since $y_{a_1} = \dots = y_{a_{s-1}} = x_{a_1}$ and there are at most $k - 1$ elements in $y$ equal to $x_{a_1}$, there are at most
$k - 1 - (s - 1) < k - \ell - 1$ faults. Hence, $I$ is non-empty. Using the inductive assumption,

$$C(S) = \sum_{\emptyset \subset D_{\ell+1} \subseteq [k - \ell - 1]} C(R[D_1 \leftarrow a_1, \dots, D_\ell \leftarrow a_\ell, D_{\ell+1} \leftarrow a_{\ell+1}]) = \sum_{\emptyset \subset D_{\ell+1} \subseteq I} (-1)^{\ell + 1 + |D_1| + \dots + |D_{\ell+1}|} q = (-1)^{\ell + |D_1| + \dots + |D_\ell|} q$$

by inclusion-exclusion. □

7 Conclusion 

A quantum query algorithm for $k$-distinctness is presented in the paper. The algorithm uses the learning
graph framework. The improvement in complexity is due to a sequence of new ideas enhancing the
framework: partial assignments in the vertices of the learning graph, arcs with the weight dependent on
the variable being loaded, fault-tolerant learning graphs, and others.

Future research may concentrate on the following problems. Is it possible to use some of these
ideas to improve the quantum query complexity of other problems? The complexity of the algorithm
in the paper has rather bad dependence on $k$. Is it possible to improve the dependence using a more
advanced fault-tolerance technique? Finally, we know that Ambainis' algorithm can be implemented
time-efficiently. Is this true for the algorithm in this paper?